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ABSTRACT 

We extend models of our Galaxy based on distribution functions (DFs) that are analytic func¬ 
tions of the action integrals to extended distribution functions (EDFs), which have an ana¬ 
lytic dependence on metallicity as well. We use a simple, but physically-motivated, functional 
forms for the metallicity of the interstellar medium as a function of radius and time and for 
the star-formation rate, and a model for the diffusion of stars through phase space to suggest 
the required functional form of an EDF. We introduce a simple prescription for radial migra¬ 
tion that preserves the overall profile of the disc while allowing individual stars to migrate 
throughout the disc. Our models explicitly consider the thin and thick discs as two distinct 
components separated in age. 

We show how an EDF can be used to incorporate realistic selection functions in models, 
and to construct mock catalogues of observed samples. We show that the selection function of 
the Geneva-Copenhagen Survey (GCS) biases in favour of young stars, which have atypically 
small random velocities. With the selection function taken into account our models produce 
good fits of the GCS data in chemo-dynamical space and the Gilmore and Reid (1983) density 
data. 

From our EDF, we predict the structure of the SEGUE G-dwarf sample. The kinemat¬ 
ics are successfully predicted. The predicted metallicity distribution has too few stars with 
[Fe/H] ~ —0.5 dex and too many metal-rich stars. A significant problem may be the lack of 
any chemical-kinematic correlations in our thick disc. We argue that EDFs will prove essential 
tools for the analysis of both observational data and sophisticated models of Galaxy formation 
and evolution. 

Key words: Galaxy: kinematics and dynamics - evolution - abundances - disc - solar neigh¬ 
bourhood. 


1 INTRODUCTION 

Significant observational resources are currently being devoted to 
surveys of our Galaxy from both the ground (APOGEE, LAMOST, 
Gaia-ESO, GALAH) and space (Gaia). Much of this effort centres 
on determining the chemical compositions of stars in addition to 
their phase-space locations. A star carries its chemistry throughout 
its life, so we may hope to infer from it the time and place of its 
birth. In particular, chemistry is the only indicator of radial migra¬ 
tion, a process that has attracted much attention since Sellwood & 
Binney (2002) showed that it is the dominant effect of spiral struc¬ 
ture and Schonrich & Binney (2009) argued that it explains the 
structure of the solar neighbourhood. 

It is much harder to determine the chemical composition of 
a star than to measure its position and velocity, so the density of 
stars in phase space has been extensively studied in the absence of 
chemical data. To a good first approximation our Galaxy should 
be in dynamical equilibrium, so by Jeans’ theorem the phase-space 
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density of stars, f(x, v), should depend on the phase-space coor¬ 
dinates ( x, v) only through constants of stellar motion. There are 
many reasons to prefer action integrals J; (i = 1, 2,3) over other 
constants of motion (e.g. Binney & Tremaine 2008, §4.6), so the 
natural first step in the interpretation of a Galaxy survey is to relate 
the data to a distribution function (DF) of the form f(J). 

Binney (2012b, hereafter B12) showed that models in which 
f(J) is an analytic function are remarkably successful in repro¬ 
ducing data from the Geneva-Copenhagen survey (hereafter GCS) 
(Nordstrom et al. 2004; Holmberg et al. 2009). Subsequently, Bin¬ 
ney et al. (2014) showed that a DF f(J) that had been fitted to the 
GCS predicts the kinematics of stars in the RAdial Velocity Exper¬ 
iment (RAVE, Steinmetz et al. 2006) with remarkable success. Re¬ 
cently Piffl et al. (2014) obtained the tightest constraints yet on the 
Galaxy’s dark-matter distribution by fitting RAVE data with DFs 
of the form introduced by Binney & McMillan (2011) under the 
assumption of a variety of trial Galactic potentials 4?(a:). 

In the work of B12 and Piffl et al. (2014), the implicit assump¬ 
tion was that the probability of a star being included in a survey is 
independent of its age or metallicity, and varies only with location. 
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This assumption is false, because at a given distance the probabil¬ 
ity that a star will be included in a survey declines with decreasing 
luminosity to zero below a threshold luminosity. Since luminosity 
depends on age, metallicity and mass, the fraction of any coeval 
cohort of stars at a given location that will be included in a survey 
varies with the cohort’s age and metallicity. Moreover, the kine¬ 
matics of any coeval cohort will depend on its age because stars are 
born on nearly circular orbits and drift over time onto more inclined 
and eccentric orbits. Consequently, one can predict the kinematics 
of the stars at some location x that are included in a given survey, 
as distinct from the kinematics of all the stars at x, only if one takes 
into account the dependence of luminosity on age and metallicity. 

The DFs introduced by B12 include age as an internal param¬ 
eter, but make no reference to metallicity. If we are to engage with 
the luminosities of stars through isochrones, we must recognise that 
every chemically distinguishable population of stars has its own DF. 
We could proceed by seeking a DF fz ( J ) for each discrete bin, la¬ 
belled by Z , in chemistry space. Alternatively, we could make the 
distribution function a continuous function of chemistry by writing 
Here we do the latter and call the function /( J. Z) an ex¬ 
tended distribution function (EDF). Hoping to emulate the success 
that analytic DFs have had, in this paper we introduce an analytic 
form for the Galaxy’s EDF. 

The discrete approach was advocated by Bovy et al. (2012c,b), 
who argued that sub-populations of SEGUE G dwarfs defined by 
cells in ([Fe/H], [a/Fe]) space have simple spatial and kinematic 
structures. This discrete approach relieves one of the necessity of 
picking a functional form for the EDF that is consistent with the ac¬ 
tual density of stars in data space. However, discretisation has three 
drawbacks. First, choosing bin sizes always requires a compromise 
between losing the information contained in the position of each 
datum within its bin and increasing Poisson noise by making the 
bins small. Second, since the distribution of stars in chemical space 
is established by a large number of enrichment events, we have 
reason to believe it smooth. If we permit the data-fitting routine 
to choose a discontinuous distribution, we run the risk of masking 
astrophysically important signals in the data. Third, we require er¬ 
rors in the ([Fe/H], [a/Fe]) space that are much smaller than the 
bin sizes, otherwise we are neglecting the possibility of contamina¬ 
tion on each bin by neighbouring bins. Additionally, a continuous 
parametrization allows for a rigorous treatment of the error distri¬ 
butions in ([Fe/H], [a/Fe]) and how these errors correlate with the 
kinematic errors. Hence we believe that it is best to work with an 
EDF provided we are confident that we have a sufficiently flexible 
and well-tailored functional form. 

A very useful working hypothesis is that all disc stars were 
born near the plane from an interstellar medium (ISM) that has only 
a radial abundance gradient. With this hypothesis chemical compo¬ 
sition provides a clue to the radius and time of a star’s formation, 
because the chemical composition of the ISM has evolved over the 
life of the Galaxy from very a-rich and metal-poor to solar-type a 
abundances and metal-rich, with the evolution being expected to be 
fastest and most effective at small radii. Hence the chemical dis¬ 
tribution of stars at any location x is intrinsically interesting, and 
by upgrading a model’s DF to an EDF we gain the ability to predict 
measured chemical distributions. 

In Section 2, we introduce the functional form of our EDFs. 
In Section 3, we discuss the data that will be relevant for our ex¬ 
periments with EDFs. In Section 4, we discuss selection functions 
and how one can model the kinematics of a data set without ex¬ 
plicitly modelling the selection function. In Section 5, we go on to 
fit the parameters of our extended distribution function to the GCS 


data. In Section 6, we construct mock catalogues for the GCS and 
SEGUE G dwarfs from the extended distribution functions. Sec¬ 
tion 7 sums up and looks to the future. 


2 MODEL 

In principle, an EDF is the joint distribution function of the phase- 
space coordinates (x, v), and any additional properties of each star, 
such as ([Fe/H], [a/Fe], T/g, log g,. ..). Here we extend the usual 
DF to include just metallicity [M/H] - for simplicity we assume 
zero a-enhancement for the stars, so [M/H] and [Fe/H] can be 
used interchangeably. We start from the DF introduced by B12 and 
take inspiration from Schonrich & Binney (2009, hereafter SB09), 
who model in full the joint chemical and dynamical evolution of 
the Galactic disc under the conventional assumption that at a given 
radius r and age r the interstellar medium (ISM) has a well-defined 
metallicity [Fe/H](r, r). Chemical evolution models based on this 
assumption have a long history (Matteucci & Francois 1989; Chi- 
appini et al. 2001) but the importance of radial mixing for these 
models has only been realised in recent years beginning with the 
SB09 models. 

In the following subsections we develop our modelling ap¬ 
proach culminating in the EDF specified by equation (21). The use¬ 
fulness of this EDF in no way depends on the correctness of the 
arguments we use below to motivate its construction. Hence busy 
readers may skip forwards to equation (21 ) and on to Section 3, and 
thus bypass our account of the chain of reasoning that leads us to 
propose equation (21). 

2.1 ISM metallicity 

In this section we specify a functional form for the metallicity of the 
ISM [Fe/H] (r, r). Fig. 6 of SB09 shows the evolution of [Fe/H] in 
the ISM at a grid of radii. The models of SB09 include full chem¬ 
ical evolution as well as gas accretion and flows. However, despite 
the complexity of the input model, the resulting form for the ISM 
metallicity as a function of radius and time is remarkably simple. 
We find that this evolution can be quite well fitted by the functional 
form 

[Fe/H] (r, r) = F(r,r) 

= [F(r) - F m ] tanh + F m , ' ’ ’ 

where r m = 12 Gyr and F m are, respectively, the age and the 
metallicity of the oldest stars. Hence, like SB09, we assume that 
the protogalactic material was pre-enriched to some small metal¬ 
licity F m . The parameter tf controls the rate at which metallicity 
increased at early times, when r ~ r m and tanh[(r m — t)/tf) ^ 
(Tm — t)/tf ■ The ratio T m /rF is large enough that the tanh 
function is essentially unity for recently-born stars (r <C r m ), 
so [Fe/H](r,0) ~ F(r). Hence the function F(r) describes the 
metallicity of the ISM at the current time. 

We adopt a current metallicity-radius relation 

F(r) = F m (l- exp ^ ]) ■ (2) 

where Fr is a new constant. Near the Sun the argument of the 
exponential is small so F(r) ~ Fr(t — ff). It follows that Fr 
is the current metallicity gradient within the ISM at the Sun, and 
is negative. At large radii the argument of the exponential becomes 
large and negative so the metallicity tends to F m at all radii. Note 
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Figure 1. Metallicity against age for our EDF. Each line shows metallicity 
against age for birth radii linearly spaced by 1 kpc between the uppermost 
line, corresponding to r = 0 kpc, and the lowest line, corresponding to 
r = 30 kpc. The red dashed curve corresponds the solar radius and the 
vertical blue dashed line shows our chosen age divide between the thin and 
thick discs. 


that when we have set F m , Fr and vf the highest metallicity in the 
disc is set as/^(O) = F m (l — exp[FurF/Fm]). Also, in this model 
the ISM radial metallicity gradient becomes steeper with time. 

Fig. 1 shows metallicity against age for a series of birth radii 
for the choice of parameters that will emerge in Section 5. This 
plot may be compared with Fig. 6 of SB09. In the SB09 model 
the radial metallicity gradient Fr ~ —0.082 dex/ kpc is steeper 
than in most competing models (e.g. Wang & Zhao 2013; Minchev 
et al. 2014). Note that this gradient was not fitted to the GCS data 
by SB09. but instead comes from a fit to open cluster data. 


2.2 Quasi-isothermal distribution function 

We start from the DF of the disc introduced by B12. Its fundamental 
building block is the quasi-isothermal DF: 

MQ) = ^[l + tanh(J 0 /L o )]^e-^/ fld 

X K c~ KJr/a " V c 

<?r erf 

Each quasi-isothermal is controlled by four parameters, Q = 
( Rd, R a , c r0 , cr-o), to be introduced later. Here Cl(R c ), k(R c ) and 
i'(.Rc) are the circular, radial and vertical frequencies of the cir¬ 
cular orbit of radius R C (Jf). We choose positive Galactocentric 
radial velocity, vr, to be away from the Galactic Centre and posi¬ 
tive azimuthal velocity, v#, and hence positive angular momentum, 
J 4 ,, to be in the direction of Galactic rotation. With this conven¬ 
tion ( vr , Vt/yjVz) form a left-handed coordinate system. The factor 
containing tanh eliminates retrograde orbits (negative J© with the 
constant Lo = lOkms -1 kpc. The extent of random motions is 
controlled by the velocity-dispersion functions 

(jfRc) = c;o e < ' Ro ~ Rc)/R " 7 (i = r, z ), (4) 

where R a ~ 2Rd is a parameter that sets the scale of the outward 
decline in velocity dispersion within the disc. Numerically, ato is 
nearly equal to the dispersion of v. L near the Sun. 

The actions J = ( J r , J^, J z ) given (x,v) are found using 
the axisymmetric “Stackel fudge” algorithm presented in Binney 


Table 1. Parameters of the Galactic potential used throughout the paper. 


Thin 

R p / kpc 

2.4 


z d / kpc 

0.36 


IMq pc -2 

1106 

Thick 

R p /kpc 

2.4 


z d / kpc 

1. 


Ed /M 0 pc“ 2 

73 

Gas 

R p /kpc 

4.8 


z d /kpc 

0.04 


Ed IM e pc“ 2 

114 


Rm / kpc 

4 

Bulge 

Po /Mq pc”* 

0.76 


r 0 /kpc 

1 


7 

1.8 


ft 

1.8 


q 

0.6 


Tt / kpc 

1.9 

Halo 

PO IM e pc-* 

1.26 


ro / kpc 

1.09 


7 

-2 


ft 

2.21 


q 

0.8 


rt t kpc 

OO 


( 2012 a) with an adaptive choice of the coordinate system param¬ 
eter, A, using the procedure described in the appendix of Binney 
(2014). Note that as the the potential is axisymmetric is the z- 
component of the angular momentum. 


2.3 Galactic potential 


Throughout the paper we use an adjusted version of Potential II 
from Dehnen & Binney (1998) that consists of a thin and thick disc, 
a gas disc and two spheroids representing the bulge and the halo. 
This potential is purely axisymmetric and as such does not fully 
model the central bar of the Galaxy. We have increased the scale- 
height of the thin disc to 360 pc and increased the mass of the thin 
disc such that the circular velocity at the solar radius (Ro = 8 kpc) 
is 220 km s _ 1 . The functional form for the discs is given by 


Pd{R,z) 


— exD ( - — _ — - M3 
2za R R P Zd / 


(5) 


where R v is the scale-length, Zd the scale-height, Ed is the central 
surface density and R m controls the size of the hole at the centre 
of the disc which is only non-zero for the gas disc. The spheroids 
obey the functional form 


p s (m) = p°(^) 7 ( 


H- 

r o 


m\ ~t-P i 
) exp ^ 



(6) 


where m = (R? + q~ 2 z 2 f^ 2 . po is the scale density, ?’o a scale- 
length, q a flattening, 7 and /3 control the inner and outer slopes, 
and rt is a truncation radius. The adopted parameters are given in 
Table 1. 


2.4 Thin-thick disc decomposition 

In the B12 models the disc’s DF is a sum of contributions from the 
thin and thick discs 

/(J) = /thn(J) + /thk(J) (7) 
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Figure 2. Star-formation rate (SFR) as a function of age of the Galaxy. 
The blue dashed line shows our chosen divide between the thin and thick 
discs. The SFR at the birth of the Galaxy is zero, is a maximum around the 
thin/thick disc divide and then declines exponentially with time constant 
Tf =8 Gyr. 


and the DF of each sub-disc is a sum over coeval cohorts, each 
cohort having a quasi-isothermal DF: 


f a (J) = J dr f a (J, t) 

= [ dr r a (r)/(J|Q Q (r)) (a = thn.thk). 


( 8 ) 


The star-formation rates, F a , are given by 


a preliminary EDF: 

f a (J,[ Fe/H]) = f drr a (r)/(J|Q a (r))«5[[Fe/H]-F(fl c ,r)]. 

( 12 ) 

This equation states that all stars were born cold in the disc at some 
radius r — R c with some metallicity [Fe/H], which together define 
a unique age. The stars then heat over time, but their guiding radii 
remain fixed. 


2.5 Radial migration 

We now introduce changes in the angular momenta of stars. At a 
corotation resonance with some non-axisymmetric feature, be it a 
molecular cloud, a spiral arm or a bar, temporary trapping of a star 
by the resonance can cause an abrupt shift in angular momentum 
while leaving the star’s other two actions, J r and J- unchanged. 
Sellwood & Binney (2002) named this process “churning” and ar¬ 
gued that such angular-momentum changes from at birth to 
now bring to the solar neighbourhood metal-rich stars born in the 
inner Galaxy and metal-poor stars born in the outer Galaxy. 

Churning is just one aspect of the diffusion of stars through 
action space - diffusion parallel to the axis, whereas heating 
arises from diffusion perpendicular to this axis. Consequently, the 
proper procedure for constructing an EDF is to assume a form 
for the DF of each coeval cohort at birth (e.g. a cold exponential 
disc as in SB09), and to convolve these with the Green’s function 
G(J , J' ,t) that solves the action-space diffusion equation, (e.g. 
Binney & Tremaine 2008, §7.4.2(c)) 


Fthn ("r) 


r t hk(r) 


F(r) 

if r < tt, 

0 

otherwise. 

F(r) 

if TT ^ T ^ T m 

0 

otherwise. 


(9) 


Thus all stars formed prior to tt = 10 Gyr are thick-disc stars and 
all younger stars are thin-disc stars. The global star-formation rate 
is given by 




GO) 


where ‘S is a normalization constant that must be found numeri¬ 
cally and Tf S> t 3 . The form of the distribution was chosen such 
that the star-formation rate increases from zero at the birth of the 
Galaxy to a maximum at r = r m — d TsT f an d decays exponen¬ 
tially until the current time. Motivated by the results of Aumer & 
Binney (2009) we set r/ = 8 Gyr and leave r s as a free parameter 
that controls the thin-thick disc ratio. Fig. 2 shows the form of r(r) 
for the parameters chosen in Section 6. 

As in B12 the velocity-dispersion parameters of the thick disc 
(r > tt) are independent of age. Hence the age of a thick-disc star 
affects its chemistry but not its kinematics. The parameters of the 
thin disc depend on age according to the prescription 


/ r + ri x' 

Cio(r) = - am. (thin disc) (11) 

\t t + T\ J 

We model the stochastic heating of the thin disc by setting f3 r = 
0.33 and = 0.4, and set n = 110 Myr. 

Now that we have defined the metallicity of the ISM as a func¬ 
tion of radius and age, and the DFs as functions of age, we can write 


df 

dt 


d_ 

dJ 


(~D w f + D (2) 



(13) 


for a delta-function source of stars at J 1 a time r in the past. Here 
£>(!) 

is a vector of first-order diffusion coefficients and D (2) is a 
symmetric matrix of second-order diffusion coefficients. That is, 
we ought to write the general solution to the diffusion equation in 
the form 


f(J,r) = j dVG(J, J',T)fo(J',T), 


(14) 


where G(J, J', r) is the probability that a star of age t has scat¬ 
tered from J' to J since its birth, and fo(J', r) is the DF at birth 
of the cohort of age r. G(J, J ', r) is the solution of equation (13) 
that tends to 5( J — J') as r —> 0. 

The orbit-averaged Fokker-Planck equation (13) differs from 
the standard heat conduction equation by the presence on the right 
of the term D {1 ^ df /dJ. This term causes a systematic drift to¬ 
wards the origin of action space that counteracts the tendency of the 
term containing D (2) to cause stars to diffuse to large actions, and 
therefore large energy. In the familiar context of cluster dynamics, 
the term with U (1 ^ describes dynamical friction, while that with 
D u>) drives evaporation. 

To the extent to which we can neglect the participation of gas 
and the dark halo in spiral structure, radial migration should con¬ 
serve the total angular momentum of the stellar disc. This being so, 
the tendency of the term in D u>) to drive stars to large needs 
to be counteracted by a term in D n > generating a drift back to 
Jrj, = 0 . 

In the interests of computational speed, we eliminate the inte¬ 
grals in equation (14) over J r and J z by guessing that their effect 
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can be approximated by the increases in the velocity-dispersion pa¬ 
rameters given by equation (11). We do however evaluate the inte¬ 
gral over J,f, with equation (13) simplified to 


9/ _ 9 ( D W f . °lo 9 / y 

9 1 9 J<$, \ ^ 2r m dj,f,)' 


(15) 


( 2 ) 

Here we have assumed that D\1 is independent of and 

parametrized it in a form that implies a random walk in with 
constant steps. If D^ were to vanish, an initial distribution 5( — 

J'f) in J,/, would remain Gaussian as stars diffused through action 
space, with the dispersion evolving in time according to 


&l(t) = — ) 

The Green’s function for equation (15) is 


/ Tm 

r (J 0 - 4-D^t) 2 , 

1 2 ™l 0 CXP 

i 2a|_ 0 t/r m J 


(16) 


(17) 


Evolving a 5-function with this Green’s function produces a Gaus¬ 
sian packet that broadens according to equation (16) and drifts in 
the negative direction with ‘velocity’ D'f'. 

We relate the first-order diffusion coefficient to our cho- 

<p 

( 2 ) 

sen form of by requiring that the DF of the whole Galac¬ 
tic disc is a stationary solution of the Fokker-Planck equation 
(equation 15). In this way the disc does not broaden in time; 
hence, the total angular momentum is conserved. For simplicity 
we model our full DF by a cold exponential disc in radius with 
scale-length Rd in a potential with constant circular speed V c : 
f oc exp[— Jcf,/(V c Rd)]- The diffusive flux (given by the right- 
hand bracket in equation 15) vanishes for the stationary DF pro¬ 
vided 


D 


(i) _ 
& 


a 


2 

L 0 


^TjnVcRd 


(18) 


The Green’s function (17) ensures that the disc’s total angular 
momentum is constant providing the disc is exponential and has 
a flat circular-speed curve. SB09 enforced conservation of J$ by 
making the transition probability between grid points in angular 
momentum depend upon the product of the masses associated with 
those grid points. Unfortunately, this enforcement makes G depend 
upon /, or stated differently, destroys the linearity of the diffusion 
equation by making the diffusion coefficients functions of the DF. 
Our formalism makes the diffusion coefficients depend on / but 
only for the specific case that / is exponential, whilst the SB09 
formalism conserves angular momentum for a general /. We do not 
wish to be so sophisticated in this introduction to the EDF, but our 
simple prescription should capture the relevant global properties of 
radial migration. Recently, Kubryk et al. (2014) have shown that the 
churning resulting from an iV-body simulation can be well fitted 
by a random walk with a spatially varying dispersion, and that the 
resulting spatial profiles for stars born at given radii match the SB09 
results well. It is clear that whilst an individual radial migration 
event may be awkward to model and depend on the exact form 
of the scattering potential, a global picture of radial migration can 
be formed through simple ‘recipes’. The recipe we have presented 
is perhaps oversimplified but we will show it goes a long way to 
accounting for the data. 

Note that if each population is born as a 5-function in J' and 
J' z and an exponential in R! c such that 




T(t)5(4)5(J') 


‘ 2 £l{Jf)R c -R' c /R d 

K 2 (J'+)Rf 


(19) 


and the Green’s function is 


G(J J' t) = 


c _ (jA _ J / | _ 0 a) r) 2 / 2 j 2 (r) 

j 


V 2na l{ T ) 

x c -v(Jd,)(Jz-Jl)/'T z AJcb,T) 


( 20 ) 


then we can perform the integrals over J' and J' using the 5- 
functions, and the resulting distribution function for the population 
of age r is the one given in equation (21). Thus under highly plausi¬ 
ble conditions our prescription for evolving the DF of a population 
by increasing the dispersion parameters and convolving with the 
one-dimensional Green’s function (17) is equivalent to convolving 
with the full three-dimensional Green’s function. 


2.6 Extended distribution function 


With our churning prescription included, the EDF becomes 


-Gt-J^-D^r) 2 /2a 

^f/Tj 

x ^(j;,r)/(J'!Q Q (r))5[[Fe/H] - F(R' c ,r)}, 


U(J,{ Fe/H]) = ^dj; J dr F a (r)- 


( 21 ) 


where J' = ( J r , Jf J z ), R' c = RfJ'fj. Stars are unable to mi¬ 
grate through J 0 = 0 so we have included a normalization factor 
given by 


^(4,T) = 2[l + erf(^=^)] \ (22) 

which ensures (2n) 3 f dj^ dr d[Fe/H] d li J f = 1 as shown in 
Appendix A. Here we introduce the notation 


G{J<h J^t) 


V 2na L 


(23) 


to simplify future expressions. 

In Fig. 3 we show the angular momentum distribution of stars 
born at different radii after they have migrated for 2, 6,12 Gyr us¬ 
ing the parameters of our chosen EDF given in Section 6. For sim¬ 
plicity we use the thin disc scale-length in equation (18) for D^\ 
After 2 Gyr the main feature is the broadening of the distribution 
but on longer timescales we observe more noticeably the drift to¬ 
wards the origin. Additionally, we have plotted the exponential disc 
profile, and we see that by r = 12 Gyr the distributions of the stars 
born innermost are tending towards the exponential. In Section 6 
we will see that our prescription preserves the radial profile of our 
discs. 

The middle and right panels of Fig. 3 make it clear the ex¬ 
tent of radial migration that is required to account for the observed 
breadth of the metallicity distribution in the solar neighbourhood. 
In particular, stars that are only 6 Gyr old and were formed at 
R = 2 kpc have a non-negligible chance to be found in the so¬ 
lar neighbourhood. Recently Kordopatis et al. (2015) argued that 
the metallicity distribution of RAVE stars requires radial migration 
on this, perhaps surprising, scale. 

The approach we have taken lacks a degree of elegance be¬ 
cause we have treated radial migration and heating differently, 
whereas they are in reality just two aspects of diffusion in phase 
space. Specifically, following B12 we have built heating (diffusion 
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Figure 3. Illustration of our radial migration prescription: we show the angular momentum distribution of stars born at radii of 2, 4, 6, 8,10,12 kpc (marked 
by dashed lines) after 2 Gyr (left panel), 6 Gyr (central panel) and 12 Gyr (right panel). The axes above each plot show the approximate guiding-centre radius 
assuming a flat rotation curve of 14 = 220 km s —1 . The black dotted line shows an exponential profile with scale-length R,/ = 3.5 kpc. 


in the J r and J z directions) into the basic DF, while radial migra¬ 
tion (diffusion in the J$ = J,f, direction) has been explicitly in¬ 
cluded. This artificial break in symmetry leads to ambiguity as to 
the value of at which the velocity-dispersion parameters oy and 
tj z should be evaluated. If evaluated at the birth angular momen¬ 
tum J'^, this would imply that the star spent its entire life near its 
radius of birth, while if evaluated at its final angular momentum 
this would imply that it spent its entire life near its present guiding 
radius. In reality, the actions J r and J z that a star acquires reflect 
the intensity of the fluctuations in the gravitational field that it has 
experienced over its entire life, and this intensity will be character¬ 
istic of an angular momentum that is intermediate between J^ and 
J(j)~ 

For a particle undergoing a random walk, the most probable 
path between J$ and J^ is linear such that, on average, a star at the 
Sun has experienced the mean of the heating events at the angular 
momentum passed through. We could therefore opt to evaluate the 
dispersion parameters at the average of birth and current angular 
momentum. However, we choose, for simplicity, to evaluate these 
parameters at the current angular momentum. Experiments using 
the average angular momentum produce very similar results mostly 
because the best-fitting scale-lengths for the velocity dispersions 
are large so the dispersion parameters don’t vary significantly with 
radius. 

Indeed, Minchev et al. (2012) showed from a numerical sim¬ 
ulation that the final velocity dispersions of stars that had migrated 
into a given radial bin matched the final velocity dispersions of stars 
that had spent their entire lives in that bin. Kordopatis et al. (2015) 
have found that an analogous result holds for stars sampled by the 
Radial Velocity Experiment (RAVE). This gives us confidence that 
our approximation is a valid one. 


dF/dR c (r = Tm) = 0, and 3F/3 t(t = 0) « 0. Therefore, for 
the thin disc we use the 5-function to perform the J^ integral and 
for the thick disc we use it to perform the r integral. For the thin 
disc, we obtain 


/thn(J, [Fe/H]> = [ TT dr 
Jo 


r t hn{r)f(J IQthn(f)) - . ,/ , 

|3F/3f?,||3f? c /3J^| ^ ’ 


(24) 


where J ^ is given by F(R c (J' c j > ), r) = [Fe/H], which may be in¬ 
verted analytically. Additionally, we have that 


3 F 
SR C 


(r, r) = F r exp ( 


—Fr(v — Vf) 


) tanh (^)’ 


3f? c _ 

dJc/, f? c K 2 ' 


(25) 


For the thick disc, we have 


t it rev mn _ f°° J T> ^thk{r)f(J'\Qthk) nrT J, ^ 
/thk(J, [Fe/H]) — y dj^ |3_F/3r| G{Jih Jc/n'r), 


(26) 


where 

^ = — (F(r)-F m ) S ech 2 (^^y (27) 

Again we find r by inverting r) = [Fe/H] analytically. 

For convenience, we limit the integration range to ±3<t.lo and per¬ 
form the integral over R' c . These two one-dimensional integrals are 
then performed numerically using a 10-point Gaussian quadrature 
scheme. 


2.7 Performing the integrals 

To evaluate equation (21), we need to evaluate a double integral 
over r and J^. Fortunately, one of these integrals is trivial as the 
integrand contains a 5-function. When we use the 5-function to 
evaluate the integral, we obtain derivatives of F with respect to 
either r or J £ in the denominator of the integrand. The choice of 
whether to perform the r or J/, integral first depends upon the prop¬ 
erties of these derivatives over the integration range. We note that 


2.8 Halo edf 

One practical problem with the above EDF is that any star that falls 
outside the allowed range in [Fe/H] (e.g. [Fe/H] < F ra ) is deemed 
unphysical by the model (assuming negligible errors in the metal- 
licity). This problem can be solved by the inclusion of a DF for the 
stellar halo. The data to which we will fit the EDF are not very sen¬ 
sitive to the stellar halo, but the inclusion of a halo DF allows us to 
assign any ‘unphysical’ star to the halo. 
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We construct a simple action-based distribution function for 
the halo of the form (Posti et al. 2015) 

/halo(J) = (J 0 + Jr + 0.681 J 0 |+O.7J,)3' (28) 

The model generated by this DF has a simple power-law density 
profile with a core that is specified by the parameter Jo. We choose 
Jo = 180kms _1 kpc. This model has a density profile p oc r ~ 3 
outside its scale radius of r ss 5 kpc, and p ~ const, inside, and 
has velocity dispersions at the Sun of au « aw ~ 130kms _1 
(Brown et al. 2010). The factors multiplying | Jf and J z are ap¬ 
proximately S 2 ^/fi r and f l z /Q, r at the solar position such that the 
halo model is approximately isotropic. In addition to this action- 
based part, we include a simple Gaussian in metallicity such that 
our halo EDF is given by 

p— ([Fe/H]—F h )/2af, 

/haio(J,[Fe/H]) = / halo (J)- 7 == -. (29) 

f2na 2 F 

We set the mean metallicity as Fh = —1.5 dex and the width of the 
metallicity distribution function as <t.f = 0.5 dex. We assume all 
stars in the halo are of age 12 Gyr. In what follows, the weight of 
the halo, fchaio, is allowed to vary, but we expect that it will be such 
that the halo contributes ~ 0.1 per cent in the solar neighbourhood 
(Juric et al. 2008). 


3 DATA 

Here we introduce the data to which we will fit EDFs and 
with which we will then test its predictions. We require seven¬ 
dimensional data (six phase-space coordinates and the metallic¬ 
ity [M/H]). We use data from the GCS and SEGUE survey, 
complemented by the stellar density data from Gilmore & Reid 
(1983). Additionally, we place the Sun at Ro = 8 kpc and zo = 
0.014 kpc (Binney et al. 1997) and we use the peculiar solar ve¬ 
locity from Schonrich, Binney, & Dehnen (2010): (vr, vj,, v z )q = 
(—11.1,12.24, 7.25) kms -1 (recall positive vr is away from the 
Galactic centre and positive v$ is in the direction of Galactic rota¬ 
tion such that (vr, v^, v z ) form a left-handed coordinate system). 

3.1 Geneva-Copenhagen Survey 

The Geneva-Copenhagen Survey (GCS) (Nordstrom et al. 2004; 
Holmberg et al. 2009) is a sample of 16 682 nearby F and G stars 
extending out to ~ 200 pc. Through a combination of uvby/3 pho¬ 
tometry, line-of-sight velocity, Hipparcos parallax and proper mo¬ 
tion observations, the catalogue provides a view of the chemo- 
dynamical structure of the solar neighbourhood. We use the most 
recent re-analysis of the survey by Casagrande et al. (2011). These 
authors used the infrared flux method (IRFM) to produce more con¬ 
sistent effective temperature and metallicity scales. This re-analysis 
found that the stars were on average 0.1 dex more metal rich than in 
previous analyses. We use all stars in the catalogue with proper mo¬ 
tions that were flagged by Casagrande et al. (2011) as having reli¬ 
able metallicity determinations. This reduces the data set to 12 723 
stars. 

Because the GCS is a local survey, it is dominated by thin-disc 
stars, and the influence of the thick disc is subtle (Binney 2012a). 
Due to the accuracy of the Hipparcos astrometry, the GCS pro¬ 
vides us with a precision velocity distribution in the solar neigh¬ 
bourhood, and led to the discovery of substructures in the ( vr , vf) 
plane (Dehnen 1998). In particular, the peak of the vp distribution 


is associated with the Hyades moving group, and the Hyades and 
Sirius moving groups endow the vr distribution with a flat top. 
The v z distribution appears free of substructure (Dehnen 1998). 
The presence of substructure is important when attempting to fit a 
zeroth-order model as it makes the comparison of model and data 
more difficult to interpret. 

The data we use for each star are (l, b, ur, i>||, p, [Fe/H]), 
along with the corresponding errors, where zu is the parallax, p 
is proper motion and the other symbols have their usual mean¬ 
ings. We adopt the reported errors in (l, b, w,v\\, p), and following 
Casagrande et al. (2011) we use <7[ Fe / H ] = 0-12 dex for all stars. 

3.2 SEGUE G dwarfs 

The Sloan Extension for Galactic Understanding and Exploration 
(SEGUE Yanny et al. 2009) is a low-resolution spectroscopic sur¬ 
vey of stars fainter than r = 14, complemented by ugriz pho¬ 
tometry. As such, it provides a view of the outer parts of the disc, 
dominated by the thick disc, and the stellar halo of the Galaxy, 
and so complements the more local GCS sample. The SEGUE 
data are available as part of SDSS DR10 (Ahn et al. 2014). These 
data were reduced using an improved SEGUE Stellar Parameter 
Pipeline (SSPP) (Smolinski et al. 2011), which, like the latest GCS 
re-analysis, used the IRFM to produce more consistent effective 
temperatures. However, this did not significantly affect the obtained 
metallicities. 

Here we use only SEGUE data that satisfy the target selection 
criteria for G dwarfs: a star with 14 < r < 20.2 and 0.48 < 
g — r < 0.55. From the G dwarfs, stars were selected that (i) have 
r ^ 15 to ensure completeness at the bright end, (ii) are given 
valid parameter estimates by the SSPP, and (iii) were not flagged 
as noisy or with a temperature mismatch. In addition, we impose a 
cut in surface gravity, log g ^ 4.2, to ensure we include only dwarf 
stars, we remove both stars with SNR< 15 and stars in fields with 
E{B — V) ^ 0.3 on the Schlegel et al. (1998) extinction maps. 
Finally we remove stars with no measured line-of-sight velocity or 
proper motions. The final sample contains 18 575 stars. 

We estimate the distances to SEGUE stars using the method 
presented in Schlesinger et al. (2012). The majority of the stars 
are from the outer disc, so we expect them to be old. We, there¬ 
fore, assume all stars have an age of 10 Gyr. Using the 10 Gyr 
YREC isochrone provided by An et al. (2009), we first bracket the 
provided metallicity for each star with two isochrones. For each 
isochrone, we find the closest entry to the star’s reported (g — r ) 
colour. The ugriz magnitudes are found by linearly interpolating 
between the two entries in each isochrone. The distance s is deter¬ 
mined by averaging the five estimates from each of the extinction- 
corrected ugriz bands. We make no consideration of the errors 
in the colours, magnitudes and metallicities. A Bayesian distance- 
estimation algorithm, such as that presented by Burnett & Bin¬ 
ney (2010), would be preferable. However, for dwarf stars, we ex¬ 
pect the cruder approach of Schlesinger et al. (2012) to be ade¬ 
quate - they estimate the errors in their distances and conclude that 
there is a random distance uncertainty of 18 per cent for stars with 
[Fe/H] > —0.5 dex, and 8 per cent for more metal-poor stars, in 
part due to errors in the isochrones. Additionally, there are sys¬ 
tematic distance uncertainties arising from the single-age assump¬ 
tion (expected to produce a 3 per cent distance overestimate for the 
metal-rich stars) and the presence of undetected binaries produces a 
~ 5 per cent distance underestimate for approximately 65 per cent 
of the population. In what follows we neglect both of these system¬ 
atic errors. 
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3.3 Gilmore-Reid density curve 

Gilmore & Reid (1983) measured the stellar density as a function 
of distance away from the Galactic plane, by observing a sample of 
K dwarfs towards the South Galactic Pole. This was the first study 
to indicate the existence of a thick disc. 


4 SELECTION FUNCTIONS 


Before comparing our model to data, we must understand the se¬ 
lection effects of a survey. In this section we discuss (i) how we 
include the selection function in our modelling approach, (ii) the 
selection functions for the GCS and the SEGUE survey, and (iii) 
the possibility of avoiding explicit use of a selection function. 

The selection function of a survey is the probability of a star 
being in the catalogue given its properties. The selection is nearly 
always done on the basis of the observed properties of a star. If we 
denote S as meaning ‘in the survey", the probability of an individ¬ 
ual stellar datum, D, given the model, M, and given it is in the 
survey, S , is 


p(D\M,S) 


p(S\D)p(D\M) 

p(S\M) 


(30) 


where we callp(S'|D) the selection function, p(D\M) the distribu¬ 
tion function, and p(S\M) is the probability that a randomly cho¬ 
sen star in the Galaxy enters the catalogue given a particular model. 
p(S\M) only comes into play when fitting the model to the data. 

Stars are selected for inclusion in a spectroscopic survey, such 
as SEGUE, from a photometric survey on the basis of criteria in¬ 
volving apparent magnitude and possibly colour. To relate colours 
and magnitudes to the physical properties of stars such as age 
and metallicity, we require a set of isochrones. We use 19 BaSTI 
isochrones (Pietrinferni et al. 2004) spaced by ~ 0.25 Gyr for 
t < 2 Gyr and 1 Gyr for r > 2 Gyr for each of the 12 metal- 
licities listed in Table 2. We assume that all populations of fixed 
metallicity and age were born with a universal initial mass function 
(IMF), £(m). We adopt the Kroupa et al. (1993) IMF 


0.035 to- 13 

if 0.08 ^ m < 0.5 


0.019m" 22 

if 0.5 ^ m < 1.0 

(31) 

0.019m" 2 - 7 

if m ^ 1.0. 



Here m is the mass of the star in units of the solar mass. With this 
choice, we can write down our full distribution function as 


f(x, v, [Fe/H], r, m) = f(x, v, [Fe/H], r)£(m) 

= (27t) 3 J dj; /( J, [Fe/H], r, 

(32) 

Below we assume that the selection functions involve appar¬ 
ent magnitude, colour, l and b. A combination of m, [Fe/H] and r 
taken with the isochrones uniquely determine a colour and an ab¬ 
solute magnitude. Coupled with a distance, the absolute magnitude 
implies an apparent magnitude. Therefore, we can consider a selec¬ 
tion in colour and magnitude as a selection in mass, metallicity, age 
and distance. 

If we want to determine the distribution of the arguments of 
our distribution function, X = (x, v, r, [Fe/H], J^), with a selec¬ 
tion of this form, we write 

p(X\M,S) oc J dmp(S , |s, m, r, [Fe/H], l, b)£(m)f(X) 

= p(S\s,r, [Fe/H], l, b)f(X), 


Table 2. Metallicites of the BaSTI isochrones used. 


Z Y [Fe/H] 


0.00001 

0.245 

-3.27 

0.0001 

0.245 

-2.27 

0.0003 

0.245 

-1.79 

0.0006 

0.246 

-1.49 

0.001 

0.246 

-1.27 

0.002 

0.248 

-0.96 

0.004 

0.251 

-0.66 

0.008 

0.256 

-0.35 

0.01 

0.259 

-0.25 

0.0198 

0.2734 

0.06 

0.03 

0.288 

0.26 

0.04 

0.303 

0.40 


where 

p(S\s,r, [Fe/H], l, b) = j dmp(S\s,m,T, [Fe/H], l, b)£(m). 

(34) 

p(S|s, r, [Fe/H], l, b) can be calculated independently of the dy¬ 
namical model. In particular, if the selection function is indepen¬ 
dent of l and b we must only engage with the isochrones once. The 
resulting pre-tabulation can then be interpolated for any choice of 
(r, s, [Fe/H]); any call outside the grid uses the nearest grid point. 


4.1 GCS selection function 

For the GCS, we use the selection function in SB09, which was ap¬ 
proximately constructed using the selection rules from Nordstrom 
et al. (2004) and comparison with the target catalogues. The re¬ 
sult is a simple selection function in Stromgren colour (6 — y) and 
apparent magnitude, v. We assume that the selection function is in¬ 
dependent of the line-of-sight such that p(S\s,r, [Fe/H], Z, 6) = 
p(Sjs, r, [Fe/H]). Following SB09, we first cut the isochrones at 
the bottom of the red giant branch (flagged in each isochrone and 
corresponds to a minimum in the luminosity for high-mass stars). 
This cut fails to remove many model stars that lie on the subgiant 
branch of an isochrone, and the GCS does not include such objects. 
Therefore, we also cut all isochrone points with M y < 1 and M y < 
—62.5(logT e ff — 3.78) to reproduce approximately the edge of the 
sample observed by Casagrande et al. (201 1) 1 . For each isochrone, 
we form a grid in the logarithm of distance between a minimum 
value and the value at which the selection function falls to zero. At 
each distance s, we find J dmp(SGCs\s,m,T, [Fe/H])£(m) for 
each isochrone. Thus, we construct a three-dimensional grid over 
which we can interpolate given any triple (s, r, [Fe/H]). Fig. 4 
shows the resulting selection function for a star located at s = 
60 pc (approximately the peak of the GCS distance distribution). 
We see that it peaks at around 2 Gyr, where the majority of GCS 
stars lie (Casagrande et al. 2011). 


4.2 SEGUE selection function 

In Bovy et al. (2012c), it is shown that the SEGUE G dwarf selec¬ 
tion is uniform in (g — r ) and a near step function in r. The position 


1 Importantly M y m My and M v = M y + 2(6 y) + mi. 
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Figure 4. GCS selection function for a star at 60 pc. The left panel shows the selection in age coloured by metallicity, and vice versa for the right panel. 


of this step depends upon the plate P, in particular whether it is a 
bright or faint plate, and has the functional form 

p(S\ r , g-r,P)= W ^[ 1- ^h( r ~ ex r - u ( t _+ 0 - 1 )], (35) 

where r cu t depends upon the plate, and Wp is the fraction of 
SDSS targets that have spectra. The selection function is set to zero 
outside the r magnitude interval [14.5,17.8] for bright plates and 
[17.8, 20.2] for faint plates. Additionally we set the selection func¬ 
tion to zero for logy < 4.2 dex. We use the publicly available 
code from Bovy et al. (2012c) to find the location of r cu t for each 
plate by comparison with SDSS. Note here that, unlike the GCS, 
the selection function depends upon the line-of-sight l and b as the 
selection function depends upon the plate P. In Fig. 5, we show the 
selection function for a star at 2.5 kpc observed in the faint plate 
#1881. We see that it is approximately flat with age and falls to 
zero for [Fe/H] > — 0.3 dex. Even without a physically-motivated 
model, i.e. one in which the stars at high altitude are metal-poor, 
the SEGUE selection function for the faint plates is such that metal- 
poor stars are preferentially selected. However, the selection func¬ 
tion for the bright plates includes many metal-rich stars. 


4.3 Side-stepping the selection function 

We have just detailed appropriate selection functions for the GCS 
and the SEGUE G dwarf sample. However, sometimes the selection 
function of a survey is either overly complex or hard to determine 
because it depends on possibly un-documented historical and social 
factors - particular objects might have been added to the target list 
because they were suspected of some special property, while other 
objects fall away on account of unusual difficulties. This is very 
likely to be the case when a survey is still underway. In these cases, 
it may be difficult to construct the selection function accurately. 
Here we discuss how a model may be compared to data without 
reference to the selection function. 

Say, from a survey, we have for each star all the observables 
in the set X. If we know that the survey is constructed by selecting 
on a subset of these observables, denoted y C X, then there is 
a subset of observables x = X — y on which there has been no 
explicit selection. For instance, we may know that the selection was 
performed in colour but not in velocity. Therefore, the velocities are 
free from an explicit selection although they are implicitly biased 
by the selection since any relation between velocity and colour, e.g. 


bluer stars are younger and so have a lower velocity dispersion, 
implies that selecting blue stars lowers the velocity dispersion of 
the sample. 

In this case, we proceed by considering the conditional prob¬ 
ability p(x\y, S). We write 


p{x\y,S) 


p(x,y,S) = p(S\x,y)p(X) 
p(y,S) p{S\y)p(y) 
p(S\x, y)p(X) 
p{S\y) I d n xp(y,x)' 


(36) 


We know that the selection is only in the observables y so 
p(S\x, y) = p(S\y). Therefore, we find that 


p(x\y,S) 


P(X) 

f d n xp(y, x) 


p{x\y). 


(37) 


This equation states that the conditional probability of the observ¬ 
ables x given the other observables y and the fact this star is in the 
sample S is just the conditional probability of the observables x 
given the other observables y regardless of how the selection on y 
is made. All we need to know is that we have selected only on y. 

This argument underpins models of the disc obtained from ve¬ 
locity histograms (Binney 2012a; Binney et al. 2014). For a survey 
such as RAVE, the selection is not performed on the line-of-sight 
velocities or proper motions. Therefore, given a set of observables 
y for each star, we can sample a set of velocities v from the model. 
These velocities may then be used to construct histograms to com¬ 
pare with the data. The approach taken by Piffl et al. (2014) was 
first to bin the data in y and then to sample the distribution in v at 
the centroid of each bin. For this approach, we should dissect the 
sample into small cells in y-space because without the selection 
function we do not know with what weights the p(x,y) should 
be added to form the sample-wide distribution p(cc|,S'). Finally, we 
note that for the above approach to be mathematically valid, y must 
contain all observables used in the selection. The approaches of 
Binney et al. (2014) and Piffl et al. (2014) did not use the appar¬ 
ent magnitude information of each star. Additionally, Piffl et al. 
(2014) select a giant subsample of the RAVE survey using log g. 
Therefore, to use the above approach on this subsample we have 
y = (l, b, I, (J — K), log g ). By neglecting to use I and instead 
using the distance, we have implicitly biased the age distribution of 
the stars and hence the velocity distributions. 

The disadvantage of this route around the problem of the se¬ 
lection function is that we end up not using the available informa- 
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Figure 5. SEGUE G dwarf selection function for a star at 2.5 kpc for the faint plate #1881. The left panel shows the selection in age coloured by metallicity, 
and vice versa for the right panel. 


tion to the full. Consider, for instance, the case of the RAVE veloc¬ 
ity sampling. With a full dynamical distribution function, the power 
comes from the link between the spatial and velocity distributions, 
which are connected by the potential. For a fixed potential, we may 
be able to find a velocity distribution that matches the data, but the 
spatial distribution will then match the data only if we have found 
the true potential. This is the principle that was used by Piffl et al. 
(2014) to constrain the distribution of dark matter. If we consider 
only velocity information, we lose this power and can only rely on 
spatial gradients in the velocity distribution for any constraint. To 
constrain the potential, one must engage with the selection function 
of some survey, which Piffl et al. (2014) implicitly did by adopting 
the vertical density profiles of Gilmore & Reid (1983) or June et al. 
(2008). 

The second problem with the above approach is that it is more 
expensive computationally than an approach that uses the selec¬ 
tion function. If we use equation (30) to fit the data, we must cal¬ 
culate the denominator p(S\M) once to sufficient precision that, 
the error in the Nth power of p(S\M) (where N is the number of 
data points) does not dominate our posterior probability (McMillan 
& Binney 2013). However, this is a single computation for each 
considered model. When we do not explicitly use the selection 
function, as in equation (37), we must compute the denominator 
f d n xp(y, x) for each star to comparable precision. One way of 
approaching this is to tabulate the integral on a grid in y and in¬ 
terpolate. However, the dimensionality of y can be large, and it is 
challenging to reduce interpolation errors such that the noise does 
not dominate the posterior probability. 

In conclusion, one can model a survey without knowing the 
selection function, hut at increased computational cost and giving 
reduced diagnostic power for a given model than when the selection 
function is used. However, in certain cases one may have no choice 
but to proceed without the selection function. In the following, we 
do use selection functions. 


15 parameters: 

Thin. 7?d,thni 0Vo,thn, tTzO,thnj 

Thick. 7?d,thk> Ifrj. 1 .hk ■ &rO,th k• ^z 0 .t lik , T s \ 

Metallicity: a L o , tf , Fr , F m , vf ; 

Halo: fchaio- 

We seek to maximise p(D\Sgcs, M) given by 
p(D\Sgcs, M) = Y[p(h,bi,Wi,viii,y,i, [Fe/H] i |S' G cs, M), 

i 

(38) 


where 


p(l,b,zu,v ||, /la, [Fe/HH^GCS, M) 


1 

p(Sgcs\M) 


x J d 5 g' G 5 {g - g\ cr g ) s' 6 cos b 
x J drp(5 G cs|s / , r, [Fe/H ]')f(x',v',T, [Fe/H]'), 


(39) 


where G 5 is a 5D Gaussian to give the convolution of the observ¬ 
ables g = (zu, [Fe/H], D||,/x) with the errors a g and the primed 
quantities are functions of g'. p(Sgcs|s , ,G [Fe/H/) is the selec¬ 
tion function as detailed in Section 4.1. This approach uses uni¬ 
form priors on all model parameters. However, we note that our re¬ 
sults are essentially independent of our choice of weak prior on the 
model parameters. We perform the integral over the errors (g 1 ) us¬ 
ing Monte Carlo integration. Following McMillan & Binney (2013) 
we use a fixed sampling of 50 points per star to remove numerical 
noise and to allow pre-computation of the actions: if we were to 
draw new Monte Carlo sampling points for each set of parameters 
in the EDF, our parameter choices would be dominated by numeri¬ 
cal noise. 

We assume that the completeness along each line-of-sight is 
the same such that 


p(Sgcs\M) = J dldbdss 2 cosb 

x J d 3 i;d[Fe/H] drdj/p(S'|s,r, [Fe/H]) 
x f{x,v,r , [Fe/H], 4). 


5 CHOICE OF PARAMETERS 

We now turn to fitting our EDF to data. To choose the parameters 
of our model, we use the GCS and Gilmore-Reid data, and vary the 


© 2014 RAS, MNRAS 000, 1-24 















Extended distribution functions for our Galaxy 11 


We perform this integral using the VEGAS algorithm implemented 
in the CUBA package Hahn (2005). 

We also use the Gilmore-Reid data in the fits. The log- 
likelihood of the Gilmore-Reid data is given by 

log^GH = Y I lo glo[pGR( g )/pDF(*)] I 3 (41) 

“ I ct G r(«) I 

where pgr is the density profile from Gilmore & Reid (1983), ctgr 
are the errors in log 10 (pGR) and pdf is the density profile calcu¬ 
lated using the EDF integrated over all metallicities. The quantity 
we seek to maximise is 

log .5? = logp(DGCs|*S' G cs, M) + xlog^GR- (42) 

We perform this procedure using the Nelder-Mead multi¬ 
dimensional minimization routine (Nelder & Mead 1965) imple¬ 
mented in the Gnu Science Library (Galassi et al. 2009). \ is some 
weight which we set to 10. This is to encourage the fitting pro¬ 
cedure to take the rather few Gilmore-Reid data points seriously. 
The introduction of \ can be considered as a restrictive prior on the 
density of the models considered in fitting the GCS. Additionally, 
it dilutes the effects of substructure in the GCS, which we do not 
want to fit. 


6 RESULTS 

The parameters selected by the Nelder-Mead algorithm are given 
in Table 3. We note that whilst our procedure is statistically sound, 
we know the model will not perfectly match the data because (i) 
we are using a fixed potential, (ii) our model ignores substructure 
like that seen in the GCS velocity distribution, and (iii) the Nelder- 
Mead algorithm is likely to select a local minimum, particularly for 
high-dimensional problems. 


6.1 MCMC exploration of model space 

In addition to the minimization routine, we have also performed a 
fuller MCMC search of the parameter space. The parameters cho¬ 
sen by the Nelder-Mead simplex algorithm lie within two standard 
deviations of the centres of the one-dimensional posterior distribu¬ 
tions from the MCMC chains. 

As expected from a very local sample, the scale-lengths of 
the discs are not well constrained by the GCS data. The thin disc 
velocity-dispersion parameters are, by contrast, constrained to less 
than lkms -1 . Similarly, the thick disc radial velocity-dispersion 
parameter is constrained to about lkms -1 , whilst the vertical 
velocity-dispersion parameter is anti-correlated with the weight 
of the halo and so not as well constrained. Additionally, the cur¬ 
rent ISM gradient parameter is constrained to Fr = (—0.064 ± 
0.002) dex/kpc so that at the Sun the ISM metallicity gradient 
is ~ —0.061 dex/ kpc. This agrees remarkably well with the ra¬ 
dial metallicity gradient for Cepheids from Genovali et al. (2014) 
(—0.060 ± 0.002 dex/ kpc). The radial migration parameter olo 
is ~ 1200 km s -1 kpc such that the oldest stars have migrated 
~ 5 kpc during the lifetime of the Galaxy, but much higher migra¬ 
tion strengths (~ 1600 km s -1 kpc) are also compatible with the 
data. One might anticipate, however, that the data might be con¬ 
sistent with lower radial migration strength and steeper metallicity 
gradient, which interestingly is not favoured by the model. It ap¬ 
pears that the youngest stars constrain the local metallicity gradient 


whilst the width of the metallicity distribution constrains the de¬ 
gree of radial migration required. As we have only explored a sin¬ 
gle ISM metallicity parametrization the tight constraint on the local 
metallicity gradient may be a result of our tight parametrization and 
other variations of the metallicity with time could be explored. For 
instance one of the Chiappini et al. (2001) models produces a shal¬ 
lowing metallicity gradient with time at late times which is not a 
possibility with our parametrization. 

We will not report further on the results of the MCMC 
searches of parameter space because we have only varied the pa¬ 
rameters of the EDF, and not the parameters of the potential; we 
defer a full fitting procedure (including the potential) to a future pa¬ 
per. Since we are not fitting the potential, we cannot hope to obtain 
a statistically optimal representation of the data. Rather we seek a 
good model. Below we examine how well the data are fitted by the 
model chosen by the Nelder-Mead simplex algorithm as described 
in Section 5. 


6.2 The best-fitting model 

We begin by inspecting the global properties of the model. Fig. 6 
shows the global radial and vertical profiles of the full model and of 
the individual components. Along with the radial profiles we have 
plotted the profiles of the discs at their birth and we see that our 
radial migration prescription has preserved the radial disc profiles. 
The thick disc profile is slightly broadened as we are using the thin 
disc scale-length to calculate D^ 1 '. In the vertical density panel we 
also show the Gilmore & Reid (1983) data which are well fitted by 
our EDF. 

In Fig. 7 we plot the radial density profiles for different sub¬ 
populations. In the left panel we show the radial density profiles for 
super-solar and sub-solar metallicity populations. We observe that 
the total profile is exponential whereas neither sub-population can 
be well modelled by a single exponential. This finding contrasts 
with the contention of Bovy et al. (2012c) that all mono-abundance 
populations have exponential density profiles. In the right panel of 
Fig. 7 we show a further subdivision into four bins: the thin disc 
with [Fe/H] < —0.2 dex and with [Fe/H] > —0.2 dex, and the 
thick disc with [Fe/H] < —0.2 dex and [Fe/H] > —0.2 dex. If we 
fit an exponential to the density profiles of these structures in the so¬ 
lar neighbourhood, we find local scale-lengths of 17 kpc (2.7 kpc) 
for the thin disc metal-poorer (richer) than [Fe/H] = —0.2 dex, 
and 3.2 kpc (2.1 kpc) for the thick disc metal-poorer (richer) than 
[Fe/H] = —0.2 dex. We obtain a very long scale-length for the 
low-metallicity thin-disc population because its profile is not really 
an exponential. The analysis of the SEGUE G dwarf data in Bovy 
et al. ( 2012 c) produced a similar result. 

6.2.1 Sampling mock catalogues 

To compare our model and data, we sample mock catalogues from 
the model. At the reported l and b of each star in a catalogue 
we sample distance, metallicity and velocity. We use a simple 
rejection-sampling technique using a uniform distribution in dis¬ 
tance and Gaussians in metallicity and Galactocentric velocities. 
We sample from the distributions p(s, [Fe/H], v r , v z \h, bi , Sk) 
(where i denotes the datum and k the survey), convert the quanti¬ 
ties to Galactic coordinates and scatter by the reported error in the 
datum. For this procedure to be valid, we require the errors in the 
observables to be independent of s, [Fe/H], vr, v$ and v z . In the 
histograms that follow, the errorbars show the Poisson error and do 
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Table 3. EDF parameters: parameters found from the fitting procedure presented in Section 5. Additionally we give a brief description of the physical meaning 
of each parameter. It is these parameters that are used to produce mock catalogues in Section 6 . 


Thick 


-Rd/kpc 
R a / kpc 

ffro/kms -1 

ffzo/kms -1 


2.31 Thick disc scale-length 

6.2 Thick disc velocity-dispersion scale-length 

50.5 Thick disc radial velocity-dispersion parameter 

51.3 Thick disc vertical velocity-dispersion parameter 


Thin 


Rd/ kpc 
Ra / kpc 
ff r o/kms _1 
o-jjo/kms -1 

"FW dex kpc 1 

Fm/ dex 
t s / Gyr 

0\lo/ 1OO kpckms -1 
W Gyr 
rp / kpc 
fchalo 


3.45 Thin disc scale-length 

7.8 Thin disc velocity-dispersion scale-length 

48.3 Thin disc radial velocity-dispersion parameter 

30.7 Thin disc vertical velocity-dispersion parameter 


Other 


Not fitted ft 


—0.064 ISM radial metallicity gradient at solar radius 

—0.99 ISM metallicity at birth of Galaxy 

0.43 Early star formation growth timescale 

11.5 Radial migration strength 

3.2 ISM metallicity enrichment timescale 

7.37 Radius of current ISM solar metallicity 

1.2 x 10 -3 Stellar halo weight 


Pz 

Tm/ Gyr 
it/G yr 
17 /Gyr 
ri / Myr 
F h / dex 
ah/ dex 


0.33 Thin disc radial heating growth parameter 

0.4 Thin disc vertical heating growth parameter 

12 Age of Galaxy 

10 Age of thin-thick disc separation 

8 Thin disc star-formation rate decay constant 

110 Parameter to control velocity dispersions of stars born today 

— 1.5 Mean metallicity of stellar halo 

0.5 Width of metallicity distribution of stellar halo 


— Thin 


Thick 


Halo — All 


Thin 


Thick 


Halo — All 




zj kpc 

Figure 6. Radial (at 2 = 0) and vertical (at R = Rq) density profiles for the full extended distribution function. The colour-coded dotted lines (key above 
panel) show the contributions of each component to the composite profile (full red line). The blue and green dashed lines in the left panel show the radial 
profiles of the discs at their birth (note they are slightly offset for visibility), demonstrating that our radial migration prescription does not broaden either disc 
significantly. In the right panel we also show (black points) the Gilmore & Reid (1983) data. 


not give any indication of the size of systematic errors in the mod¬ 
els. We use black points for the data, and coloured points for mock 
catalogues drawn from the EDF. 


6.2.2 Apparent magnitude cut 


Before comparing to actual data, we perform a simple experiment 
to demonstrate the importance of selection functions and of adding 
metallicity information to the DF. We construct a sample of 10 000 
stars along the line-of-sight l = 0, b = 45° with the selection 
function 


p(S\V) 


1 if V < 8 
0 otherwise, 


(43) 


which creates a magnitude-limited sample. We note that our sam¬ 
ple will not contain any stars with m < 0.5 Mg as the BaSTI 
isochrones do not contain any points with m < 0.5M 0 . Approxi¬ 
mately 0.05 per cent of the stars in our sample have m < 0.55 Mq, 
so we anticipate that there should be fewer than 10 stars in our sam¬ 
ple with m < 0.5M@. This small number should not affect the 
statistics. The resulting mock metallicity and velocity distributions 
are shown in red in Fig. 8. We now turn the selection function off 
(set p(S'|s, t, [Fe/H]) = 1) and resample the velocities given l , b, 
distance and metallicity. The blue points in Fig. 8 show the result¬ 
ing distributions. If there were no correlations between the kine¬ 
matic and intrinsic properties of stars, the velocity distributions of 
the two mock catalogues would be identical. However, the stars of 
a young cohort are on average more luminous than the stars of an 
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— Thin, [Fe/H] <-0.2 Thick, [Fe/H] <-0.2 



Figure 7. Radial (at z = 0) density profiles split in chemistry. The left panel shows the total profile (in red) along with the contributions from super-solar and 
sub-solar metallicity stars. The right panel shows the contributions from four bins: thin and thick disc split into [Fe/H] < —0.2 dex and [Fe/H] > —0.2 dex. 


older cohort, and consequently the proportion of younger stars in 
a survey is greater than the proportion of these stars at any loca¬ 
tion in the Galaxy. Since young stars tend to have smaller random 
velocities than old stars, an observational sample has smaller ve¬ 
locity dispersions than all the stars at a given location. When we 
turn off the selection function, we are not biased in age and we ob¬ 
tain broader velocity distributions that reflect the underlying distri¬ 
butions. The Kolmogorov-Smirnov probabilities, Pks, that the red 
and blue samples in Fig. 8 are drawn from the same distribution are 
< HT 11 . 

We performed this experiment with a fainter cut, V = 10, and 
found that the difference between the two velocity distributions was 
reduced. Adopting this fainter cut decreases the relative proportion 
of nearby young stars and increases the proportion of distant old 
stars. Consequently, the resulting sample is more representative of 
the underlying population in all respects, including its velocity dis¬ 
tributions. 

The selection function used in this experiment is simpler than 
the selection function of a real survey but it serves to show the 
importance of grappling with the selection function. Given the de¬ 
pendence of luminosity and colour on metallicity, this requirement 
obliges us to work with EDFs rather than DFs. 

6.2.3 GCS mock catalogue 

We now build a mock GCS by sampling our model by the tech¬ 
nique described at the start of Section 6.2.1 using the selection 
function given in Section 4.1. The sampling neglects any correla¬ 
tion between the errors and distance, metallicity or velocity. In the 
GCS the most prominent correlation is that between the parallax er¬ 
ror and the V magnitude (Pearson correlation coefficient between 
logU and log fa 0.6). However, the parallax error does not 
correlate as well with the parallax (Pearson correlation coefficient 
between log w and log fa —0.2), so we neglect this effect and 
simply use the reported parallax errors for each star. 

Fig. 9 shows the distance distributions of the data (black 
points) and the mock sample. The peaks of the model and data dis¬ 
tributions match well but the mock sample is slightly skewed with 
respect to the data sample in the sense that the latter has too few 
distant stars and too many close stars. This potentially indicates 
that the selection function is sub-optimal. We have assumed that 


the angular selection is isotropic which may bias the distance dis¬ 
tribution slightly. Note, additionally, we anticipate that the nearer 
stars will have, on average, been assigned higher distance errors 
than expected, whilst the more distant stars will have been assigned 
smaller distance errors. From the correlation coefficients quoted 
previously, we expect this effect is small. However, it would pro¬ 
duce a discrepancy of the type encountered here. 

Fig. 10 shows the metallicity and velocity distributions. 
The model (red points) matches the data (black points) well for 
[Fe/H] > —1 dex. For [Fe/H] < — 1 dex the model slightly un¬ 
derpredicts the data but the significance of the mis-match is low as 
there are only a few stars at these low metallicities. The fitting pro¬ 
cedure seeks to match the high -2 points from the Gilmore-Reid data 
at the expense of underpredicting the number of low-metallicity 
stars. Clearly this deficiency is due to the simplicity of our halo 
model. We will see again when inspecting the SEGUE data that 
our halo model is not optimal. 

The velocity distributions in Fig. 10 are fitted well by the 
model (red points), particularly those for vr and v z . The model v$ 
distribution fails to match the peak of the data distribution, but this 
peak is due to the Hyades stream, a non-equilibrium feature that we 
are not concerned with matching. Fig. 11 shows the contributions 
to the mock catalogue from thin-disc, thick-disc and halo stars. We 
see that, as expected, the thin disc dominates over most of the space 
explored although the thick disc dominates for [Fe/H] < —0.6 dex 
and v z > 50kms _1 . 

In Fig. 10, blue points show a mock catalogue generated by 
sampling new metallicities and velocities at the solar position with 
the selection function turned off. If the GCS velocity histograms 
were fair samples of the local velocity distributions, the blue and 
red points would agree. Actually the blue velocity histograms are 
broader than the red ones. Also, the blue metallicity distribution 
has a broader metal-poor wing. These differences arise because by 
turning off the selection function, we have increased the number of 
high-age stars - Fig. 4 shows that the typical age of GCS stars is 
~ 2 Gyr. 

We find the best-fitting vertical velocity-dispersion parame¬ 
ter for the thin disc is <r 2 o,thn ~ 30kms _1 . This value is larger 
than previous values, for example those 20 to 26kms _1 derived 
by B12, because we include the bias in the GCS towards younger 
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Figure 8. Mock catalogues for a magnitude-limited survey. The red points give the mock catalogue constructed by sampling distances, metallicities and 
velocities along the line-of-sight l = 0, b = 7 t /4 for a magnitude-complete sample out to V = 8 . The blue points give the mock catalogue formed by 
resampling the velocities given the metallicities and distances of the first mock catalogue with the selection function switched off. In each plot, we show the 
Kolmogorov-Smirnov probabilities, pks> that the two samples were drawn from the same distribution. 
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Figure 9. GCS distance distributions with linear and logarithmic scales: black shows the data, red the mock catalogue. The mock catalogue is slightly deficient 
in distant stars but this will at least partly reflect an over-simplified model of the distance errors. 
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Figure 10. GCS metallicity and velocity distributions with a linear scale (left) and logarithmic (right): the top row shows the metallicity distribution, second 

the v 4 > distribution and the final row the v z distribution. The black show the data, red the mock catalogue, and blue if we 

only sample at the Sun. 
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Figure 11. Contributions to the GCS from the thin disc (black circles), the thick disc (red crosses) and the halo (blue triangles). The top left shows the 
metallicity distribution, top right the vr distribution, bottom left the v^ distribution and the bottom right the v z distribution. 
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Figure 12. 2D histograms in the planes (\z\, v^), ([Fe/H] , v </>) and ( [Fe/H] , \z\) - the coloured histogram shows the GCS data and the black logarithmically- 
spaced contours are for the mock GCS catalogue. The red and gold lines give the mean [Fe/H] in equal-width bins centred on the dots for the data and 
model. 


stars and report the velocity-dispersion of the underlying thin-disc 
population. 

Fig. 12 shows the density of stars in the (v#, | 2 |), ([Fe/H], v$) 
and ([Fe/H], |,z|) planes. The data and models match well and the 
red and gold points show that the means of v$ and |z| binned in 
1 2 1/metallicity are well recovered. 


Fig. 13 shows the gradients of the metallicity with respect to 
the guiding-centre radius and the metallicity distributions in three 
bins in vertical action J z . The two columns on the left show data 
and the two on the right show the mock GCS catalogue. In the 
two low-Jj, bins, the gradient is well recovered. The gradients in 
the high-J- bin are more discrepant although consistent within 3<r. 
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Figure 13. 2D histograms in the plane of metallicity [Fe/H] and guiding-centre radius R c and ID [Fe/H] histograms for three bins in J z (top row: J z < 
0.3kms —1 kpc, middle row: 0.3kms^ 1 kpc < J z < 1.3 kms —1 kpc, and bottom row: J z > 1.3kms _1 kpc). The left two plots show the GCS data 
whilst the right two show the mock GCS catalogue. We show the gradient d[Fe/H]/dr c for the samples in each bin only using data with 6 kpc < R, < 9kpc 
and —1.5 dex < [Fe/H] < 1 dex above the relevant plot, and the number of stars in each bin in the second and fourth column panels. The vertical line shows 
the location of the peak of the data metallicity distribution in the lowest action bin. 


Additionally, we find that the metallicity distributions for all three 
action bins are well matched. Note that the peak of the metallicity 
distribution remains fixed with increasing vertical action for both 
the data and the model. 


6.2.4 SEGUE G-dwarf mock catalogue 

We now examine a mock catalogue of SEGUE G dwarfs con¬ 
structed using the EDF of Section 6.2.3 and the selection function 
of Section 4.2. The comparison of a mock SEGUE catalogue with 
the real one is a rigorous test of our methodology in two respects. 
First, the input to the EDF from stars further than ~ 150 pc from 
the Sun is very small, being restricted to the Gilmore & Reid (1983) 
density profile. Hence the velocity and metallicity distributions in 
the mock catalogue are pure predictions and are exposed to inac¬ 
curacy of the adopted gravitational potential. Second, the distances 
used for SEGUE stars are based on a different set of isochrones 
from those used to derive distances to GCS stars. Any discrepancy 
between the isochrone sets can be expected to lead to mismatches 
between the mock and true catalogues. 

Our first attempt to make a mock catalogue revealed that the 
halo weight, fchaio, was too large, so we reduced fchaio by a factor 
of 6. The GCS does not constrain the halo tightly, and the strongest 
constraint on fchaio comes from the Gilmore-Reid data. After mak¬ 
ing this alteration, the densities at high z are slightly underesti¬ 
mated. While the degree to which we can trust these high-z densi¬ 
ties is unclear, it is likely that the need to revise fchaio reflects short¬ 
comings in our model. Indeed, the halo EDF is intentionally very 
crude, and there is the suggestion from Binney et al. (2014) that 
the J z distribution for the thick disc is not appropriate. Therefore, 


there is plenty of scope for adjusting the models, and the model we 
present here is surely sub-optimal. 

Fig. 14 shows that there is a small mismatch between the dis¬ 
tance distribution of the model (red) and that of the data (black). 
The model predicts slightly too few nearby stars and too many dis¬ 
tant stars. We can understand one potential cause of this trouble by 
inspecting the metallicity distributions of Fig. 15. The model metal¬ 
licity distribution peaks in the correct place and has a broad peak 
approximately the same width as the data peak. We note that the 
smooth increase in star formation rate at early times produces the 
rising low-metallicity edge of the distribution. At high metallicities 
([Fe/H] > 0 dex) the model predicts many more stars than are ob¬ 
served. Here we are seeing the thin disc of the model. It is these 
additional stars that seem to be distorting the distance distribution. 
At fixed apparent magnitude and colour, increasing the metallic¬ 
ity of a dwarf star shifts it in the HR diagram up from below the 
solar-metallicity main sequence and thus moves it to a larger dis¬ 
tance. We will return to the dominance of the metal-rich population 
below. 

Fig. 15 also shows the velocity distributions of the data and 
the model. The vr distribution is a good match though the model 
is slightly broader than the data for large |nfl|. This may be due 
to too much halo contribution (although from inspecting the metal¬ 
licity distribution it seems our halo weight is approximately cor¬ 
rect) or more likely the thick disc velocity dispersion is slightly too 
large. This second option is corroborated by the v$ distributions. 
The model fails to match the peak in the data and is broader than the 
data. Again, judging by the match to the counter-rotating stars it ap¬ 
pears our halo weight is correct. Decreasing the velocity-dispersion 
parameter of the thick disc would underpopulate the wings of the 
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Figure 14. SEGUE G dwarf distance distributions with both a linear and logarithmic scale: black shows the data, red the mock catalogue. 


GCS vr distribution so it seems that the solution is to adjust the 
potential. We anticipate that at least some parts of our model will 
be inconsistent with the SEGUE data as we have used a fixed po¬ 
tential. Interestingly the v z distribution broadly matches the data 
but the mean of the data is clearly offset from zero. There has been 
much in the literature recently associating mean vertical velocity 
shifts with modes in the disc (e.g. Widrow et al. 2012; Williams 
et al. 2013). However, this shift in the peak could also arise from 
systematic distance errors or zero-point errors in the SDSS proper 
motions. 

Fig. 16 shows the contributions of the thin and thick discs 
(black and red) and the halo to the metallicity and velocity dis¬ 
tributions. The top left panel clearly shows that the main peak and 
the high-metallicity wing in the overall metallicity distribution are, 
respectively, associated with the thick and thin discs. The absence 
of a high-metallicity wing in the observed metallicity distribution 
signals that the model exaggerates the contribution of the thin disc 
to the sample. We require a very dominant thin disc population in 
the plane to match the GCS data as well as provide a good fit to 
the Gilmore & Reid (1983) density curve. Additionally our high 
vertical velocity dispersion for the thin disc implies a larger scale- 
height such that thin disc stars are dominant up to ~ 0.9 kpc. How¬ 
ever, the SEGUE data require a much smaller thin disc contribution 
which seems to point towards a smaller thin disc scale-height. How¬ 
ever, for this option to then match the Gilmore-Reid density data 
we would require an adjustment to the potential. The data here are 
clearly very informative and require a very finely tuned model in 
order to describe all the features. 

Fig. 17 shows the distribution of stars in the (|z|,u^), 
([Fe/H], vd), and ([Fe/H], \z\) planes - colours show data and 
contours the model. The contours wrap round the peaks of the data 
distribution very nicely, but in the central and right panel they ex¬ 
tend too far to the right, signifying that the model contains more 
metal-rich stars than the data. 

In each panel the red points show the mean abscissa in the data 
at the given ordinate, while the gold points show these means in the 
model. In the left panel we observe that the runs of mean \z\ with 
D 0 agree well, suggesting that the potential is close to the truth. On 
account of the surfeit of metal-rich stars in the model, the model 
means lie to the right of the data means for the central and right 
panels. In the central panel, this effect is largest at low t >0 and even 


reverses sign at i )0 > V c . The excess metal-rich stars in the model 
are contributed by the thin disc. 

The right-hand panel of Fig. 17 shows that the mean metal¬ 
licity decreases at the correct rate with \z\ throughout the SEGUE 
volume. However there is a systematic difference of ~ 0.1 dex in 
the mean metallicity. At low \z\ this is due to the unwanted pres¬ 
ence of the metal-rich thin disc in the model. 

In Fig. 18 the first column shows the density of SEGUE stars 
in the (R c , [Fe/H]) plane with the stars sorted into three bins by 
J z . The third column shows the same distributions for the mock 
catalogue. In each panel the stars cluster into a nearly horizontal 
band because at any value of [Fe/H] the stars are widely distributed 
in guiding-centre radius R c . Above each panel we give the slope 
d[Fe/H]/d/( c of the band’s ridge-line found from linear regres¬ 
sion. For the data distributions the gradient is positive, whereas it 
is negative for the top two model panels (lowest values of J z ) and 
close to zero in the bottom panel. The positive gradients of the data 
distributions echo the finding of Lee et al. (2011) that a-enhanced 
stars exhibit a positive gradient of with metallicity. 

The second and fourth columns of Fig. 18 show histograms of 
[Fe/H] for each bin in J z . Our model captures the essential features 
of the data although with varying degrees of success. The peaks of 
the data distributions are replicated by the model. At low J- the 
model distribution is too wide due to the unwanted contribution of 
the thick disc. At high J z the relative thick disc to halo weight is 
perhaps slightly off or the halo model needs adjusting. In particular, 
our model produces a small peak at —1.5 dex which is the centre 
of the Gaussian used to model the halo metallicity distribution. The 
data is unimodal and steadily declines towards low metallicity. 


6.2.5 Adjusting the model 

Our model has performed surprisingly well in predicting the 
SEGUE G dwarf data considering we have not adjusted the poten¬ 
tial. Here we discuss some attempts made to improve the SEGUE 
predictions. The most obvious discrepancy between model and data 
is the over-abundance of metal-rich ([Fe/H] > 0 dex) stars in the 
model. We wish to decrease the impact of the thin disc at interme¬ 
diate Galactic heights whilst retaining a dominant thin disc popula¬ 
tion in the GCS sample. We tried decreasing /3 Z by setting j3 z = 0.3 
and simultaneously decreasing cr-o.thn = 26kms~ 1 such that the 
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Figure 15. SEGUE metallicity and velocity distributions with a linear scale (left) and logarithmic (right): the top row shows the metallicity distribution, second 
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Figure 17. 2D histograms in the planes (\z\,v r j > ), ([Fe/H], v^) and ([Fe/H], |z|) - the coloured histogram shows the SEGUE G dwarf data and the black 
logarithmically-spaced contours are for the mock SEGUE G dwarf catalogue. The red and gold lines give the mean [Fe/H] in equal-width bins centred on the 
dots for the data and model. 


younger populations remained at fixed velocity dispersion whilst 
the older populations were made cooler and so less dominant with 
height. However, this did little to reduce the number of metal-rich 
stars in the SEGUE G dwarf sample. Additionally, we tried increas¬ 
ing -Ro-.thn to ~ 12 kpc such that the relative number of hot metal- 
rich thin disc stars from the inner Galaxy was reduced. Again, this 
did little to remedy the situation. Finally, we tried evaluating the 


velocity dispersions at the radius of a circular orbit with angular 
momentum | J 4 , \ + J z as this location better reflects the disc envi¬ 
ronment that heats the given star. This would cause the contribution 
of the thin disc to die off faster with Galactic height as velocity dis¬ 
persion falls with radius. However, as the scale-lengths of the ve¬ 
locity dispersions are long in our model this small adjustment does 
not significantly change the predictions. 
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Figure 18. 2D histograms in the plane of metallicity [Fe/H] and guiding-centre radius R c and ID [Fe/H] histograms for three bins in J z (top row: J z < 
22.3 km s —1 kpc, middle row: 22.3 krn 1 kpc < J z < 100 kms -1 kpc, and bottom row: J, > 100 km s —1 kpc). The left two plots show the SEGUE 
G dwarf data, whilst the right two show the mock SEGUE G dwarf catalogue. We show the gradient d[Fe/H]/dfi c for the samples in each bin only using data 
with 6 kpc < R c < 9 kpc and —2.5 dex < [Fe/H] <0.5 dex above the relevant plot, and the number of stars in each bin in the second and fourth column 
panels. The vertical line shows the location of the peak of the data metallicity distribution in the lowest action bin. 


Another deficiency of the model is the inability to match the 
trend of [Fe/H] with v<j, and the model does not produce the posi¬ 
tive gradients of d[Fe/H]/d_R c seen in Fig. 18. In order to improve 
the situation we tried breaking the assumption that the thick disc is 
a single quasi-isothermal and made the scale-length a linear func¬ 
tion of age from 1.5 kpc at the birth of the Galaxy to 2.5 kpc at 
r = tt- This model assumes the Galaxy formed inside-out. This 
did not improve the run of v 4 , with [Fe/H] and actually produced 
more metal-rich thick disc stars in the sample, and thus shifted the 
model metallicity distribution to higher [Fe/H]. One problem is 
that the strong radial migration broadens the thick disc so the kine¬ 
matic signal of inside-out formation is washed out and we are left 
with the chemical signature: a higher abundance of metal-rich stars. 


7 CONCLUSIONS 

We have presented a simple extension of the action-based distribu¬ 
tion functions of B12 to include metallicity information. Our model 
is the first continuous chemo-dynamical model in the literature and 
is constructed such that the spatial and kinematic distributions are 
consistent with a realistic Galactic potential. Inspired by the full 
chemo-dynamical evolution models of SB09, we included analyti¬ 
cally the relationship between metallicity and dynamics. The model 
explicitly includes a simple, but novel, radial migration prescription 
that causes the solar metallicity distribution to broaden as stars mi¬ 
grate to Ro from the inner and outer disc. Despite the simplicity of 
our radial migration prescription we have shown that it preserves 
the radial profile of the disc and hence the total angular momentum 
of the system. 

All surveys have a selection function that restricts apparent 


magnitude. The luminosity of a star depends on its initial mass, 
its metallicity and its age according to the appropriate isochrone. 
By selecting on apparent magnitude, we select on a combination 
of mass, age and metallicity, and we need an EDF to model this 
selection, as we must before we can properly compare a model to 
data. 

We showed that selecting solar-neighbourhood stars on appar¬ 
ent magnitude alone, implies a strong selection on age. Because 
young stars have smaller random velocities than old stars, by pref¬ 
erentially selecting young stars, an apparent-magnitude cut returns 
a sample with a materially smaller velocity dispersion than that 
of the underlying population. This effect, which makes the GCS 
stars younger than the local population, has been neglected in re¬ 
cent studies that employed a DF, such as B12 and Piffl et al. (2014). 

Fortunately, the requirement to use an EDF rather than a DF is 
an opportunity as well as a necessity. The opportunity has two parts. 
First, an EDF should to lead to tighter constraints on the Galac¬ 
tic potential because each observationally distinguishable chemical 
population has its own dynamics yet must be consistent with the 
same Galactic potential. Walker & Penarrubia (2011) have demon¬ 
strated the value of this principle in the context of two populations 
within dwarf spheroidal galaxies, but our Galaxy, with its rich array 
of populations, offers more exciting possibilities. Similarly, Bovy 
& Rix (2013) combined constraints on the Galactic potential ob¬ 
tained by modelling independently the dynamics of 43 chemically 
distinct samples of SEGUE G dwarfs. 

The need to work with EDFs is also an opportunity in rela¬ 
tion to Galactic archaeology. Chemistry indicates where a star was 
born, which, when combined with its current kinematics, provides 
information about the history of the star and the Galaxy. 
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We fitted the parameters of our EDF to the GCS and the 
Gilmore & Reid (1983) stellar density curve under the assumption 
of an arbitrary, but quite realistic Galactic potential. The parameters 
of the fitted EDF suggest that stars migrate widely in radius in the 
course of even 6 Gyr. With the EDF we sampled a mock GCS cat¬ 
alogue and compared that to the data. The model provided a good 
fit to the GCS data although it is axisymmetric and as such unable 
to reproduce the rich velocity-space substructure of the GCS. We 
demonstrated the importance of the selection function when mod¬ 
elling the GCS, which suppresses the number of old, metal-poor 
stars such that the thick disc’s contribution to the dataset is less 
than its true contribution to the solar neighbourhood. Additionally, 
we recover the observed spatial and kinematic metallicity gradients 
in the Solar neighbourhood. 

The quality of the fits we obtain to the GCS data is very en¬ 
couraging for it shows that the adopted functional form for the EDF 
is sufficiently flexible to capture the chemo-dynamics of the imme¬ 
diate solar neighbourhood while reproducing the star-count data of 
Gilmore & Reid (1983). This is a non-trivial feat and one that could 
not be accomplished even using a functional form for the EDF that 
encompassed the truth if we had adopted a Galactic potential that 
was seriously in error. 

A natural next step would have been to fit the EDF to the data 
set formed by the GCS and SEGUE survey taken together. How¬ 
ever, we did not pursue this course, but chose instead to use the 
EDF fitted to the GCS alone to predict the SEGUE data. This was a 
bold step to take because the thick disc and halo, which dominate 
the SEGUE data, do not contribute much to the GCS, so the predic¬ 
tions are determined by the noisy tails of the GCS data. Moreover, 
the SEGUE stars are located far from the plane, so the predictions 
are sensitive to the structure of the adopted Galactic potential, and 
it is not clear that the metallicity scales of the GCS and SEGUE 
coincide. It follows that even moderate agreement between the pre¬ 
dictions and the SEGUE data must be counted strong endorsement 
of our functional form for the EDF because noise in the GCS, errors 
in the potential and discrepant metallicity scales will all ensure that 
perfect predictions were not obtained even with the functional form 
that encompassed the truth. 

The EDF fitted to the GCS predicts the velocity distributions 
of the data with considerable success, the main shortcoming being 
a distribution in v<j> that is shifted by ~ 8 km s” 1 to lower v# with 
respect to the SEGUE data. Its prediction of the metallicity distri¬ 
bution is less successful because the predicted abundance of stars 
with [Fe/H] ~ — 0.5dex is too low and the predicted abundance 
of stars with [Fe/H] > 0 is too high. The fault may not lie with 
the model - Schonrich & Bergemann (2014) argue that there is a 
bias in the SEGUE stellar parameter pipeline that causes an arti¬ 
ficial build-up of stars at [Fe/H] ss —0.5dex. Nonetheless, it is 
possible that our model of the thick disc is too simple, so we tried 
a number of small fixes to improve the SEGUE predictions. These 
experiments indicated that with our adopted potential it is difficult 
to have a large number of metal-rich stars in the GCS sample and 
very few in the SEGUE sample. Hence we tentatively conclude that 
the fault lies more with the SEGUE data than our model. 

7.1 The thick disc 

Although our thick disc has an extended (2 Gyr) period of forma¬ 
tion, during which its chemistry evolved rapidly, we have retained 
B12's assumption that the thick disc is a single quasi-isothermal. 
No physical principle underlays this assumption, it was just the 
simplest assumption to make about a component of the Galaxy that 


did not contribute largely to the GCS and about which rather little 
was known. If we are to retain the thick disc, our model of it should 
probably be more elaborate. 

Before hastening to develop an elaborate model of the thick 
disc, we should re-examine the case for the very existence of the 
thick disc. Our model includes a distinct thick disc as we require a 
distinct population with a high vertical velocity dispersion to match 
the Gilmore & Reid (1983) data at large Galactic heights. How¬ 
ever, although we have included it as a distinct population in our 
model, the thick disc does not stand out in the SEGUE G dwarf 
mock sample. Bovy et al. (2012a,c) argued from the same sam¬ 
ple in a different way that there is no evidence of the expected 
thin/thick dichotomy, and concluded that our Galaxy’s disc extends 
seamlessly from an old, a-rich wing to a young a-poor wing. If 
the thick disc can be defined in an intellectually satisfying way, 
it must be defined through chemistry (e.g. Binney & Merrifield 
1998, §10.4), and strong believers in the thin/disc dichotomy ar¬ 
gue that the distribution of stars in the ([Fe/H], [a/Fe]) plane is 
bimodal (Fuhrmann 2011; Recio-Blanco et al. 2014), while Bovy 
et al. (2012a,c) argue that this bimodality is a selection effect. More 
recently, Anders et al. (2014) and Nidever et al. (2014) have shown 
that the APOGEE data points to the existence of a bimodality in 
the ([Fe/H], [a/Fe]) plane. In our view this remains an open ques¬ 
tion, and one that is best resolved by combining EDFs with ongoing 
surveys such as the Gaia-ESO. 

If we are to have a distinct thick disc, Section 6.2.4 makes 
it clear that there must be correlations between the chemistry and 
kinematics of stars. The failures of our EDF arise because it re¬ 
stricts these correlations to the thin disc. The observed increase of 
guiding-centre radius with metallicity, even at high J z , suggests 
that the oldest, most metal-poor stars formed only at small radii 
and reach us only at the apocentres of eccentric orbits. We could 
provide for an effect along these lines by making the thick disc, 
like the thin disc, a superposition of coeval quasi-isothermals with 
velocity-dispersion parameters that increase with age. Success of 
this modification in reproducing the SEGUE data might reason¬ 
ably be interpreted as strong support for the contention that the 
thin/thick dichotomy is artificial. 

If the modification just suggested were not entirely successful, 
two departures from the assumptions of SB09 that are implicit in 
our EDF might be required to restrict metal-poor star formation to 
the inner Galaxy: (i) an increase over time in the scale length Rd of 
the star-formation rate, and (ii) a vigorous outward flow of metals 
to restrict the production of very metal-poor stars with high angu¬ 
lar momentum. The SB09 model does provide for metals to pass 
into virial-temperature gas that can flow outwards, but the domi¬ 
nant metal flow is within the disc and directed inward. 


7.2 Future work 

Our EDF should be extended to include the [a/Fe] abundances. 
[a/Fe] is an invaluable surrogate for the age of a star, and the 
([a/Fe], [Fe/H]) plane is widely used to disentangle different 
Galactic populations (Bensby et al. 2007; Bovy et al. 2012c, SB09). 
Valuable next steps in EDF modelling would be 

(i) A simultaneous fit of the EDF to the GCS and SEGUE data sets 
in a given Galactic potential. 

(ii) A simultaneous fit of the EDF and the Galactic potential to the 
GCS and SEGUE data sets. 

(iii) Fits of the EDF to the RAVE (Kordopatis et al. 2013; Boeche 
et al. 2011) and Gaia-ESO (Gilmore et al. 2012) data. Both sur- 
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veys provide ~ 10 s stars at intermediate distances from the Galac¬ 
tic plane, so nicely complement the thin-disc-dominated view from 
the GCS and the thick-disc dominated view from the SEGUE sur¬ 
vey. Binney et al. (2014) have shown that the DFs provide a good 
account of the RAVE data, whilst Piffl et al. (2014) used the RAVE 
stars to constrain the Galactic potential. These analyses essentially 
assumed the selection function of the RAVE survey is uniform in 
age, and they did not use the metallicity information. The EDF 
should be used to repeat this analysis to constrain simultaneously 
the Galactic potential and uncover signatures of evolutionary pro¬ 
cesses in the Galaxy, such as radial migration. The Gaia-ESO sur¬ 
vey provides higher-resolution spectra and more accurate metallic - 
ities than RAVE, and can see fainter stars. The constraints from 
these two surveys should be highly complementary. 

Our EDF is inspired by one particular model of the Galaxy’s 
chemo-dynamical evolution. It would be interesting to see how well 
it fits other chemo-dynamical models, and potentially to modify the 
functional form to produce one that provides good fits to several 
different models. The parameter values required to fit each model 
would then serve to place the models in a well defined space. Fits 
to observational data would allow us to locate our Galaxy in this 
space. 

7.3 Interpretation of the models 

In the coming years, Galaxy models will inevitably be important 
for scientific exploitation of Galaxy surveys because by far the best 
way to allow for the huge impact of selection effects on any Galaxy 
survey is to “observe” a model with realistic selection biases. More¬ 
over, a model enables us to make proper allowance for the impact 
of measurement errors on data. 

Models of three very different types will play roles. There are 
/V-body simulations of galaxy formation and chemical evolution 
(e.g. Brook et al. 2012; Tissera et al. 2012; Gibson et al. 2013), 
models of chemo-dynamical evolution that follow star formation 
and chemical enrichment in a gas disc that is decomposed into 
chemically homogeneous annuli (Colavitti et al. 2008; Schonrich & 
Binney 2009; Wang & Zhao 2013), and there are models with an¬ 
alytic EDFs. Additionally, there are the hybrid models of Minchev 
et al. (2013, 2014) that sit between the first two types of models 
mentioned in that they combine W-body simulations of the dynam¬ 
ics with a annulus-based chemical-evolution model. The goal of 
an EDF model is more modest than the goals of the other mod¬ 
els in that it seeks to describe how the Galaxy is configured at the 
present time, without predicting how it arrived at this state. So it 
is descriptive rather than explanatory. From this modesty we gain 
tractability in the sense that an EDF model can be most easily ad¬ 
justed to fit it to observational data. Then the model encapsulates 
in an intuitive way what the data have to say about how the Galaxy 
is currently structured. In fact, we believe EDF models will provide 
a valuable interface between observational data and cosmological 
simulations: an EDF fitted to the endpoint of a simulation will pro¬ 
vide a summary of the content of that simulation, and comparison 
of this EDF with one fitted to the observational data will indicate in 
what respects the simulation is more and less successful. 

Our models are designed to describe the present state of the 
Galaxy, which must be pinned down in advance of fruitful specula¬ 
tion as to what evolutionary processes set it up. Our adopted func¬ 
tional form of the EDF is, however, inspired by a particular model of 
the Galaxy’s historical development. It is tempting, therefore, to in¬ 
terpret the best-fitting parameters of the EDF as revealing something 


about our Galaxy’s history. We caution against over-interpretation 
of the models - they are descriptive in nature, and designed to en¬ 
capsulate our Galaxy’s current structure. Once the current state of 
the Galaxy is known, we can begin asking questions about how it 
may have reached this state. 
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The full EDF is 


f{jR, J<t>, Jz, [Fe/H], t) = 




, [1 + tanh (J^/Lq)] 


2 t 1 + erf ({ J ^ + D £ )r }/V2 (tl)] 

1 r 


V 2na l 


exp 


2al 


2Q C (J^,) _ 

X 8n3R d, a K2 ( J '<!>) 

y pxn [ — _ v{J*)Jz 1 

' P L R d}CC (7P, a (J 0 ) 
x 8{[Fe/R] - F(R' c ,t)}. 


(Al) 


This EDF assumes that all the heating occurred at the current angu¬ 
lar momentum (i.e. ay and ay are functions of J©. Note the error 
function in the denominator, and the arguments of the epicyclic fre¬ 
quencies. 

To integrate the EDF, we carry out the following steps for each 
component a: 


(i) Perform integral over [Fe/H]: integrates to one if R' c > 0 and 

T < T m - 

(ii) Integrate over Jr, J z from 0 to oo: Exponentials produce fac¬ 
tors <Tr /f/' > and a that cancel with part of the fraction. 


(iii) Integrate over J 0 from — oo to oo: The only terms that now de¬ 
pend upon Jy are the tanh and the Gaussian. The tanh restricts the 
integration limits to 0 to oo such that the integral over the Gaussian 
is given by the error function term in the denominator so cancels. 

(iv) Change integration variable from J to R' c : 

This piece cancels with the appropriate terms in the fraction. Again 
the integral is from — oo to oo but this time we don't have a tanh 
to cancel out the negative piece. However, all stars are born in the 
disc with positive angular momentum, so the negative birth angular 
momenta are forbidden. We are left with the integral 


f 


dR' c f(r,R' c ) = 


dR'c r(r) 


R'c -R'JR d 

8n 3 R 2 d 


(A2) 


which integrates to f(r) = r(r)/87t 3 . 

(v) Integration over the three angle variables removes the factor 
87t 3 . 


Therefore, for each component we are left with F a (t) and 
F(r) = ]Tr Q (T). (A3) 

OL 

T(t) has been chosen to normalize to unity (equation (10)) so the 
EDF is normalized. 


APPENDIX A: EDF NORMALIZATION 

Here we show that the EDF presented in Section 2 is correctly nor¬ 
malized i.e 

J d 3 * d 3 it dj'^ d[Fe/H] dr f(x, v , [Fe/H], t) 

= (2n) 3 J d 3 J dJ^, d[Fe/H] dr /(J, J^, [Fe/H], r) = 1. 
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